Search Results

Documents authored by Will, Sebastian


Document
SparseRNAFolD: Sparse RNA Pseudoknot-Free Folding Including Dangles

Authors: Mateo Gray, Sebastian Will, and Hosna Jabbari

Published in: LIPIcs, Volume 273, 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)


Abstract
Motivation. Computational RNA secondary structure prediction by free energy minimization is indispensable for analyzing structural RNAs and their interactions. These methods find the structure with the minimum free energy (MFE) among exponentially many possible structures and have a restrictive time and space complexity (O(n³) time and O(n²) space for pseudoknot-free structures) for longer RNA sequences. Furthermore, accurate free energy calculations, including dangles contributions can be difficult and costly to implement, particularly when optimizing for time and space requirements. Results. Here we introduce a fast and efficient sparsified MFE pseudoknot-free structure prediction algorithm, SparseRNAFolD, that utilizes an accurate energy model that accounts for dangle contributions. While the sparsification technique was previously employed to improve the time and space complexity of a pseudoknot-free structure prediction method with a realistic energy model, SparseMFEFold, it was not extended to include dangle contributions due to the complexity of computation. This may come at the cost of prediction accuracy. In this work, we compare three different sparsified implementations for dangles contributions and provide pros and cons of each method. As well, we compare our algorithm to LinearFold, a linear time and space algorithm, where we find that in practice, SparseRNAFolD has lower memory consumption across all lengths of sequence and a faster time for lengths up to 1000 bases. Conclusion. Our SparseRNAFolD algorithm is an MFE-based algorithm that guarantees optimality of result and employs the most general energy model, including dangle contributions. We provide a basis for applying dangles to sparsified recursion in a pseudoknot-free model that has the ability to be extended to pseudoknots.

Cite as

Mateo Gray, Sebastian Will, and Hosna Jabbari. SparseRNAFolD: Sparse RNA Pseudoknot-Free Folding Including Dangles. In 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 273, pp. 19:1-19:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{gray_et_al:LIPIcs.WABI.2023.19,
  author =	{Gray, Mateo and Will, Sebastian and Jabbari, Hosna},
  title =	{{SparseRNAFolD: Sparse RNA Pseudoknot-Free Folding Including Dangles}},
  booktitle =	{23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)},
  pages =	{19:1--19:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-294-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{273},
  editor =	{Belazzougui, Djamal and Ouangraoua, A\"{i}da},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2023.19},
  URN =		{urn:nbn:de:0030-drops-186454},
  doi =		{10.4230/LIPIcs.WABI.2023.19},
  annote =	{Keywords: RNA, MFE, Secondary Structure Prediction, Dangle, Sparsification, Space Complexity, Time Complexity}
}
Document
Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots

Authors: Bertrand Marchand, Sebastian Will, Sarah J. Berkemer, Laurent Bulteau, and Yann Ponty

Published in: LIPIcs, Volume 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)


Abstract
Despite being a textbook application of dynamic programming (DP) and routine task in RNA structure analysis, RNA secondary structure prediction remains challenging whenever pseudoknots come into play. To circumvent the NP-hardness of energy minimization in realistic energy models, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. While these methods rely on hand-crafted DP schemes, we generalize and fully automatize the design of DP pseudoknot prediction algorithms. We formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the tree-width tw of the fatgraph, and its output represents a 𝒪(n^{tw+1}) algorithm for predicting the MFE folding of an RNA of length n. Our general framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.

Cite as

Bertrand Marchand, Sebastian Will, Sarah J. Berkemer, Laurent Bulteau, and Yann Ponty. Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 242, pp. 7:1-7:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{marchand_et_al:LIPIcs.WABI.2022.7,
  author =	{Marchand, Bertrand and Will, Sebastian and Berkemer, Sarah J. and Bulteau, Laurent and Ponty, Yann},
  title =	{{Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots}},
  booktitle =	{22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)},
  pages =	{7:1--7:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-243-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{242},
  editor =	{Boucher, Christina and Rahmann, Sven},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2022.7},
  URN =		{urn:nbn:de:0030-drops-170414},
  doi =		{10.4230/LIPIcs.WABI.2022.7},
  annote =	{Keywords: RNA folding, treewidth, dynamic programming}
}
Document
Fast and Accurate Structure Probability Estimation for Simultaneous Alignment and Folding of RNAs

Authors: Milad Miladi, Martin Raden, Sebastian Will, and Rolf Backofen

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)


Abstract
Motivation: Simultaneous alignment and folding (SA&F) of RNAs is the indispensable gold standard for inferring the structure of non-coding RNAs and their general analysis. The original algorithm, proposed by Sankoff, solves the theoretical problem exactly with a complexity of O(n^6) in the full energy model. Over the last two decades, several variants and improvements of the Sankoff algorithm have been proposed to reduce its extreme complexity by proposing simplified energy models or imposing restrictions on the predicted alignments. Results: Here we introduce a novel variant of Sankoff’s algorithm that reconciles the simplifications of PMcomp, namely moving from the full energy model to a simpler base pair-based model, with the accuracy of the loop-based full energy model. Instead of estimating pseudo-energies from unconditional base pair probabilities, our model calculates energies from conditional base pair probabilities that allow to accurately capture structure probabilities, which obey a conditional dependency. Supporting modifications with surgical precision, this model gives rise to the fast and highly accurate novel algorithm Pankov (Probabilistic Sankoff-like simultaneous alignment and folding of RNAs inspired by Markov chains). Pankov benefits from the speed-up of excluding unreliable base-pairing without compromising the loop-based free energy model of the Sankoff’s algorithm. We show that Pankov outperforms its predecessors LocARNA and SPARSE in folding quality and is faster than LocARNA. Pankov is developed as a branch of the LocARNA package and available at https://github.com/mmiladi/Pankov.

Cite as

Milad Miladi, Martin Raden, Sebastian Will, and Rolf Backofen. Fast and Accurate Structure Probability Estimation for Simultaneous Alignment and Folding of RNAs. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 14:1-14:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{miladi_et_al:LIPIcs.WABI.2019.14,
  author =	{Miladi, Milad and Raden, Martin and Will, Sebastian and Backofen, Rolf},
  title =	{{Fast and Accurate Structure Probability Estimation for Simultaneous Alignment and Folding of RNAs}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{14:1--14:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.14},
  URN =		{urn:nbn:de:0030-drops-110446},
  doi =		{10.4230/LIPIcs.WABI.2019.14},
  annote =	{Keywords: RNA secondary structure, Structural bioinformatics, Alignment, Algorithms}
}
Document
Sparsification Enables Predicting Kissing Hairpin Pseudoknot Structures of Long RNAs in Practice

Authors: Hosna Jabbari, Ian Wark, Carlo Montemagno, and Sebastian Will

Published in: LIPIcs, Volume 88, 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)


Abstract
While computational RNA secondary structure prediction is an important tool in RNA research, it is still fundamentally limited to pseudoknot-free structures (or at best very simple pseudoknots) in practice. Here, we make the prediction of complex pseudoknots - including kissing hairpin structures - practically applicable by reducing the originally high space consumption. For this aim, we apply the technique of sparsification and other space-saving modifications to the recurrences of the pseudoknot prediction algorithm by Chen, Condon and Jabbari (CCJ algorithm). Thus, the theoretical space complexity of free energy minimization is reduced to Theta(n^3+Z), in the sequence length n and the number of non-optimally decomposable fragments ("candidates") Z. The sparsified CCJ algorithm, sparseCCJ, is presented in detail. Moreover, we provide and compare three generations of CCJ implementations, which continuously improve the space requirements: the original CCJ implementation, our first modified implementation, and our final sparsified implementation. The two latest implementations implement the established HotKnots DP09 energy model. In our experiments, using 244GB of RAM, the original CCJ implementation failed to handle sequences longer than 195 bases; sparseCCJ handles our pseudoknot data set (up to about length 400 bases) in this space limit. All three CCJ implementations are available at https://github.com/HosnaJabbari/CCJ.

Cite as

Hosna Jabbari, Ian Wark, Carlo Montemagno, and Sebastian Will. Sparsification Enables Predicting Kissing Hairpin Pseudoknot Structures of Long RNAs in Practice. In 17th International Workshop on Algorithms in Bioinformatics (WABI 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 88, pp. 12:1-12:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{jabbari_et_al:LIPIcs.WABI.2017.12,
  author =	{Jabbari, Hosna and Wark, Ian and Montemagno, Carlo and Will, Sebastian},
  title =	{{Sparsification Enables Predicting Kissing Hairpin Pseudoknot Structures of Long RNAs in Practice}},
  booktitle =	{17th International Workshop on Algorithms in Bioinformatics (WABI 2017)},
  pages =	{12:1--12:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-050-7},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{88},
  editor =	{Schwartz, Russell and Reinert, Knut},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2017.12},
  URN =		{urn:nbn:de:0030-drops-76408},
  doi =		{10.4230/LIPIcs.WABI.2017.12},
  annote =	{Keywords: RNA, secondary structure prediction, pseudoknots, space efficiency, sparsification}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail